Throttling CPU utilization by implementing a rate limiter

ABSTRACT

An approach for a hypervisor to throttle CPU utilization based on a CPU utilization throttling request received for a data flow is presented. A method comprises receiving a request for a CPU utilization throttling. The request is parsed to extract a CPU utilization level and a data flow identifier of the data flow. Upon receiving a data packet that belongs to the data flow identified by the data flow identifier, a packet size of the data packet is determined, and a rate limit table is accessed to determine, based on the CPU utilization level and the packet size, a rate limit for the data packet. If it is determined, based at least on the rate limit, that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination, then a recommendation is generated to drop the data packet.

BACKGROUND

Managing data traffic communications exchanged along datapaths is usually performed in a best-effort mode. In the best-effort mode, CPU resources are provisioned for processing data flows based on the resources' availability. Therefore, it is possible in this mode that the data flows that communicate at high rates may absorb all CPU resources. This, in turn, may prevent other applications or workloads, such as guest virtual machines and system processes and threads that execute on the same physical server, from receiving required CPU resources for some time.

The situation may become even more problematic when a distributed network encryption (DNE) mechanism is enabled on datapaths. Due to the high CPU cost required to encrypt and decrypt data packets along the datapaths, the data traffic that requires the DNE may consume all CPU resources, and leave no CPU resources for executing guest virtual machines and system processes/threads. This may delay or even stall execution of those processes and threads.

In some situations, impeding the execution of guest virtual machines and system processes/threads may slow down the processing of time-sensitive data traffic that may be deadlocked as it waits for the system processes/threads to complete their tasks. Unfortunately, the best-effort mode does not provide sufficient mechanisms for reducing the amount of CPU resources already allocated to the data traffic to allow other applications or workloads, such as guest virtual machines, to receive the CPU resources.

SUMMARY

Techniques are described herein for throttling CPU utilization by data flows in a computer virtual network. The throttling of the CPU utilization is initiated in response to receiving a CPU throttling request, also referred to herein as a request. The request may be received from a user or an administrator who executes, for example, a virtual machine that generates data traffic in the virtual network, but who requests less than 100% of the CPU resources for the processing of the data traffic so that other tasks and applications can also receive some of the CPU resources.

A CPU throttling request may be received by a hypervisor. Alternatively, the request may be received by a control plane first, and then communicated from the control plane to the hypervisor.

In an embodiment, a request received from a user or an administrator may specify a desired CPU utilization level for processing data traffic. Typically, the desired CPU utilization level included in the request will be less than 100% so that the remaining CPU resources (not allocated to the traffic processing) may be allocated to other tasks and applications.

A request may also include a data flow identifier of the data flow to which the request is to be applied. The data flow identifier may specify that all data processed at the host is throttled according to the desired CPU utilization level. Alternatively, the data flow identifier may specify a particular VM for which all data processed at the host that is generated by or is directed to the particular VM is throttled according to the desired CPU utilization level. Finally, the data flow identifier may be a destination MAC address of data packets that belong to the data flow. The desired CPU utilization level included in the request may be applied to all data packets that belong to the data flow, and a hypervisor will guarantee no more than the desired CPU utilization level for processing the data packets that belong to the data flow.

To guarantee a desired CPU utilization level for processing data traffic, the hypervisor determines, on a per-packet basis, whether a packet is to be dropped or accepted and thus transmitted to its destination. For example, if transmitting a detected data packet toward its destination would require exceeding the desired CPU utilization level requested by the user or the administrator for the data flow, then a hypervisor may recommend dropping the packet. Otherwise, the data packet may be transmitted toward its destination.

However, if a user or an administrator sent a request that specifies a desired CPU utilization level for a data flow as 100%, or if the user or the administrator did not send a CPU utilization request for the data flow at all, then the CPU resources will be assigned for processing traffic in a best-effort mode.

Upon receiving a CPU throttling request, the hypervisor parses the request to extract a CPU utilization level and a data flow identifier.

Upon detecting a data packet that belongs to the data flow, the hypervisor determines a size of the data packet and whether the data packet is encrypted. Based on the size of the data packet and the CPU utilization level, the hypervisor accesses either a rate limit table for encrypted data packets or a rate limit table for unencrypted data packets.

Based on the packet size, the CPU utilization level, and the information included in the corresponding rate limit table, the hypervisor determines a particular rate limit value for the data packet. Then, the hypervisor invokes and executes a rate limiter function using the particular limit value to determine recommendations for either dropping or accepting the packet. The recommendations are derived by analyzing the content of a so-called bucket associated with the CPU resource dedicated to processing the data flow. The concept of the bucket is described in detail in FIG. 2.

The approach may be implemented along any part of a datapath of the data packet communications. This may include implementing the approach at a point of the datapath that is between a virtual machine and a virtual switch that detects data packets on virtual ports of the switch. This may also include implementing the approach along the datapath that is internally processed by the virtual switch. Furthermore, this may include implementing the approach at a point of the datapath that is between hardware components and a hypervisor that detects data packets on uplinks configured to support communications with a physical network.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings:

FIG. 1 is a block diagram depicting an example system architecture for throttling CPU utilization;

FIG. 2 is an example flow chart for throttling CPU utilization by implementing a rate limiter;

FIG. 3A is a block diagram for determining whether to allocate CPU resources to tasks based on a CPU throttling mode or a default mode;

FIG. 3B is an example flow chart for determining a rate limit in a CPU throttling mode;

FIG. 4 depicts an example rate limit table;

FIG. 5 depicts pseudocode representing an algorithm for determining whether to accept or drop a detected packet based on a current bucket size, an arrival time, a last arrival time, a packet size, and a rate limit.

DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the method described herein. It will be apparent, however, that the present approach may be practiced without these specific details. In some instances, well-known structures and devices are shown in a block diagram form to avoid unnecessarily obscuring the present approach.

1. Example System Architecture for Implementing a CPU Utilization Throttling

FIG. 1 is a block diagram depicting an example system architecture for implementing a CPU utilization throttling. The depicted system architecture 100 includes one or more hosts 110, and one or more physical networks 190.

Host 110 may include a hypervisor 160, a data repository 150, and other hardware components 175. Hypervisor 160 may include a virtual switch 140 that may provide connectivity to and from one or more virtual machines, such as VM1 120 and VM2 122. Hypervisor 160 may use uplinks 170, 172 to provide connectivity between the virtual machines and physical network 190.

Hypervisor 160 may be configured to execute programmable instructions to implement a CPU utilization throttling. For example, hypervisor 160 may be configured to receive a CPU throttling request from a user or an administrator who executes, for example, a virtual machine or workload that generates or receives data traffic in host 110 or that is affected by another application or workload that is generating or receiving the data traffic. The request may provide a CPU utilization level for the data-traffic-related tasks of executing the virtual machine. It is assumed that the particular CPU utilization level for the data-traffic-related tasks is less than 100% so that non-traffic-related tasks can receive the remaining portion of the CPU resources.

However, if the request specifies the desired CPU utilization level for a data flow as 100%, or if no request has been received for the data flow from a user or an administrator, then hypervisor 160 may assign the CPU resources to the traffic-processing-related tasks relying on a best-effort mode. An example decision process for determining whether to use the CPU utilization throttling mode or the best-effort mode is described in FIG. 3A.
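The choice between the two modes can be illustrated with a minimal Python sketch. The function name `select_mode` and the mode constants are hypothetical names introduced here for illustration; the actual decision process of FIG. 3A may involve additional checks.

```python
from typing import Optional

# Hypothetical mode labels used only in this sketch.
BEST_EFFORT = "best-effort"
THROTTLED = "throttled"


def select_mode(requested_level_percent: Optional[float]) -> str:
    """Pick the allocation mode for a data flow.

    requested_level_percent is None when no CPU throttling request
    has been received for the data flow.
    """
    # No request, or a request for 100%: fall back to best-effort mode.
    if requested_level_percent is None or requested_level_percent >= 100:
        return BEST_EFFORT
    # Any level below 100% enables the CPU utilization throttling mode.
    return THROTTLED
```

For example, under these assumptions a flow with a 40% request is throttled, while a flow with no request is handled in best-effort mode.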

Hypervisor 160 may receive a CPU throttling request directly from a user or an administrator, e.g., by accessing a management portal exported by the hypervisor, or from a control plane (not depicted in FIG. 1). The control plane may comprise one or more controllers (computers that may operate in a cluster) which may receive direction from a management plane, which itself may comprise a user interface accessible by a user or an administrator, or from automated orchestration software. In an embodiment, the throttling request may pertain to the data traffic that is generated by a virtual machine. If the throttling request pertains to the data traffic that is generated by the virtual machine, then the throttling is applied to data packets that are detected by a virtual switch. In another embodiment, the throttling request may pertain to the data traffic that is directed toward a virtual machine. If the throttling request pertains to the data traffic that is directed toward the virtual machine, then the throttling is applied to data packets that are detected by a physical network interface card.

Hypervisor 160 takes the CPU utilization request from the administrator and ensures that the traffic processing load does not exceed the specified CPU utilization. For example, if the user or the administrator sent a request for the CPU utilization throttling for communications transmitted from their virtual machine to VM2 122, then upon receiving the request, hypervisor 160 will throttle the CPU allocation for processing data traffic based on the CPU utilization level included in the request. Specifically, hypervisor 160 will extract the CPU utilization level from the request, and use the CPU utilization level, along with the size of the packet being processed, to determine a rate limit. The rate limit function is then placed and executed at the entrance into the Tx datapath (vNIC in the sending host) and into the Rx datapath (pNIC in the receiving host). Based on the decision (ACCEPT or DROP) made by the rate limit function, the packet will be accepted into or dropped from the datapath processing.

Hypervisor 160 may determine a rate limit for the data flow once a data packet of the data flow is detected. The packet may be detected at any endpoint of a datapath 111 used to communicate the data flow. The rate limit is determined based on, for example, the CPU utilization level extracted from the CPU throttling request, a size of the detected data packet, and whether the detected packet is encrypted. Details of determining the rate limit are described in FIG. 3B.

Hypervisor 160 uses the rate limit to determine, on a per-packet basis, whether the detected packet is to be dropped or transmitted toward its destination.

For example, upon receiving a CPU throttling request from a user or an administrator, hypervisor 160 determines a data flow to which the request pertains, extracts the CPU utilization level value from the request, and uses the CPU utilization level value to determine a rate limit that is required to provide no more than the requested CPU utilization level for processing the data flow.

Hypervisor 160 may compute a rate limit using different approaches. The process of computing the rate limit may depend on whether data packets communicated within a data flow for which the CPU throttling request was issued are encrypted or not. Examples of computing the rate limit for encrypted data packets are described in FIG. 4. Once the rate limit is determined, hypervisor 160 executes a rate limiter.

A rate limiter is a computer program or a computer function that is configured to manage an abstract bucket associated with the CPU dedicated to handling the data flow for which the request was issued. The rate limiter may be implemented in hypervisor 160 or in a library of functions available to hypervisor 160.

A bucket is an abstract data structure associated with a CPU resource. It may be managed using a counter that is initialized with a default count of tokens, and then either decreased or increased as data packets of the data flow are processed using the CPU. An example process of managing the bucket is described in FIG. 2.

In an embodiment, hypervisor 160 executes a rate limiter to generate a recommendation on whether to drop or accept a detected data packet that belongs to the data flow for which a CPU throttling request was received from a user or an administrator. The rate limiter may generate a recommendation to drop the data packet when processing the data packet would place too heavy a demand on the CPU to maintain the requested CPU utilization level. If the data packet is dropped, then no additional demands are placed on the CPU. Thus, the CPU utilization may remain at, or below, the CPU utilization level specified in the request.

On the other hand, the rate limiter may generate a recommendation to accept the data packet when the utilization of the CPU dedicated to processing the data packet will remain below, or at, the CPU utilization level specified in the request. An example process executed by the rate limiter is described in FIG. 2.

To determine whether a received data packet is to be accepted or dropped, hypervisor 160 may use not only the information included in the CPU utilization throttling request, but also data stored in, for example, a data repository 150. Data repository 150 may store data tables 203 for computing rate limits for unencrypted data packets, data tables 204 for computing rate limits for encrypted data packets, a bucket capacity value 205, a last arrival time 206, a current bucket size 207, and a hyperlink for a rate limiter function 208. The data stored in data repository 150 is described in detail in FIG. 2 and FIG. 3B.

Hypervisor 160 may use an uplink UP1 170 and an uplink UP2 172 to provide connectivity between virtual switch 140 and physical network(s) 190. An uplink is a virtual port of the virtual switch that connects to a physical Ethernet adapter.

Virtual switch 140 may be configured to monitor and manage data traffic that is communicated to and from hypervisor 160. Virtual switch 140 may be implemented as a kernel component of hypervisor 160, or as an entity that is separate from hypervisor 160, but that communicates with hypervisor 160.

Virtual switch 140 may be configured to detect, on a virtual port VP1 130 and a virtual port VP2 132, outgoing (egress) data packets received from VM1 120 and/or VM2 122, respectively. Implementations of virtual switch 140 may vary and may depend on a type of product in which the switch is deployed as a virtualization medium. For example, virtual switch 140 may be implemented as part of hypervisor 160, as it is depicted in FIG. 1. Alternatively, although not depicted in FIG. 1, a virtual switch may be implemented within user space, or within a privileged virtual machine that has exclusive control over physical resources such as PNICs 180, 182.

FIG. 1 depicts that host 110 hosts virtual machines VM1 120 and VM2 122. However, host 110 may host any number of virtual machines. VM1 120 and VM2 122 may be instantiated as virtualized computing instances. The instances may be equipped with their own resources, may be assigned their own workloads, and may be configured to perform their own tasks. Virtual resources allocated to the virtual machines may include virtual CPUs, virtual memory, virtual disks, virtual network interface controllers and the like. VM1 120 and VM2 122 may be configured to execute guest operating systems and guest applications.

Hardware components 175 may include hardware processors, memory units, data storage units, and physical network interfaces such as network interface controllers (a PNIC 180 and a PNIC 182) that may provide connectivity to routers and switches of physical networks 190.

Physical networks 190 may include local area networks and/or wide area networks, and may utilize various hardware and software configurations. For example, physical networks 190 may include one or more routers, one or more switches, one or more switch ports, and other data communications and processing components.

2. Example Rate Limiter

A rate limiter is a function executed by a hypervisor to determine, based on a rate limit set for processing data packets, whether to drop or accept a data packet. A rate limiter may be implemented as a software-based function that is invoked for each data packet, or some plurality of data packets, of the data flow detected by hypervisor 160 along a datapath and for which hypervisor 160 received a CPU throttling request.

Upon receiving a CPU throttling request, hypervisor 160 extracts, from the request, a data flow identifier for the data flow and a CPU utilization level desired for the data flow. Hypervisor 160 uses the extracted information and the data stored in data repository 150 to determine a rate limit for processing a detected data packet that belongs to the data flow. The rate limit is determined periodically, or each time a data packet of the data flow is detected. Since the packets of the same data flow may have different sizes, and the rate limit depends, among other things, on a size of the packet, the rate limiter may determine a rate limit for each detected packet individually. Based on at least the rate limit, the rate limiter determines a recommendation on whether to drop or accept the detected packet. Examples of computing a rate limit are described in FIG. 4.

An output of the rate limiter is a recommendation for hypervisor 160 to either accept or drop the detected data packet to guarantee that no more than the desired CPU utilization level, requested for the data flow in the CPU throttling request, is used for processing the data flow.

A recommendation may state that the detected data packet is either to be accepted, or to be dropped. For example, if the rate limiter determines, based on the rate limit, that the desired CPU utilization level for the data flow would be exceeded if the data packet is accepted, then the rate limiter generates a recommendation to drop the data packet. However, if the rate limiter determines, based on the rate limit, that the desired CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination, then the rate limiter generates a recommendation to accept the data packet and the data packet is transmitted toward its destination.

3. Example Bucket

To throttle CPU resources according to a CPU throttling request received from a user or an administrator, a rate limiter may implement the concept of a token bucket.

A token bucket is a data structure used to store a count of tokens, which may vary from time to time. In one embodiment, the count of tokens indicates availability of a CPU resource dedicated to processing the user's or the administrator's data flow at a given time.

The token count is managed to determine whether a data packet detected along a datapath, and belonging to the data flow for which a CPU throttling request was issued, is to be dropped or accepted. To accept a detected data packet that has a particular size, the bucket needs to have a sufficient count of tokens. In one embodiment, a token in the bucket corresponds to processing 1 byte of data by a CPU.

A bucket may be initialized with a default count of tokens. The default count of tokens may be determined heuristically, or experimentally based on experiments conducted for various types of data traffic.

To accept a detected data packet, a token bucket associated with the CPU processing the data packet needs to contain a sufficient count of tokens. The detected data packet is accepted only if there is a sufficient count of tokens in the bucket to process the data packet, and if the processing of the data packet will not cause the CPU utilization level specified in a CPU throttling request to be exceeded. For example, to process a data packet that has a size of 512 bytes and to maintain a 1 byte/μsec processing rate as the packet is processed, the bucket associated with the CPU dedicated to processing the data packet needs to have at least 512 tokens.

A bucket described herein permits some packet burstiness because the token bucket allows processing a burst of the data packets as long as the bucket contains a sufficient count of tokens to process the packets.

Tokens may be subtracted from a bucket. Suppose that, at the time a data packet was detected along a datapath, the bucket stored 1000 tokens. Suppose also that a size of the detected data packet is 512 bytes. Then the bucket has enough tokens to recommend accepting the packet because the difference between 1000 and 512 is positive. Once the recommendation to accept the packet is conveyed to hypervisor 160, a count of tokens stored in the bucket is reduced by 512 because 512 tokens are required to process the packet. Therefore, upon recommending accepting the packet, the count of tokens in the bucket is reduced.

However, in some situations, a bucket may not have enough tokens to recommend accepting the data packet. This may happen when a difference between a count of tokens stored in the bucket and a size of the data packet is negative. In such a situation, a recommendation is generated to drop the packet, and the count of tokens in the bucket remains unchanged. Details pertaining to managing the bucket and tokens are described in FIG. 2.
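The subtraction rule described above can be sketched in a few lines of Python. The function name `try_consume` and its return shape are assumptions made for this illustration only; one token is taken to correspond to one byte, as in the embodiment above.

```python
from typing import Tuple


def try_consume(tokens: int, packet_size: int) -> Tuple[bool, int]:
    """Return (accepted, remaining_tokens) for one detected packet.

    Hypothetical helper: accepts when the bucket holds at least
    packet_size tokens, otherwise leaves the bucket unchanged.
    """
    if tokens - packet_size >= 0:
        # Enough tokens: recommend ACCEPT and subtract the packet size.
        return True, tokens - packet_size
    # Negative difference: recommend DROP, token count is unchanged.
    return False, tokens
```

With the numbers from the text, `try_consume(1000, 512)` yields an accept with 488 tokens remaining, while a bucket holding only 300 tokens yields a drop and still holds 300 tokens afterward.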

Tokens may be added to a bucket periodically at the rate specified by a rate limit. The rate limit is determined based on a CPU utilization level extracted from a CPU utilization throttling request, and based on data included in data repository 150. Once the rate limit is determined, the rate limit is used to determine whether, upon an expiration of some period of time, some tokens may be added to the bucket. Suppose that a rate limit for processing data packets of a particular data flow is 25 bytes/μsec, and the previously accepted data packet of the particular data flow was accepted 2 μsec ago. Then, it may be assumed that about 50 bytes (i.e., 25 bytes/μsec * 2 μsec) of the previously accepted data packet have already been processed. Therefore, 50 tokens (25 * 2) may be added to the token bucket.
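The refill arithmetic can be written out as a short Python sketch. The function name `tokens_to_add` and the choice of microseconds as the time unit are assumptions for illustration; the rate limit is expressed in bytes per microsecond to match the example in the text.

```python
def tokens_to_add(rate_limit_bytes_per_usec: float,
                  last_arrival_usec: float,
                  arrival_usec: float) -> float:
    """Tokens returned to the bucket between two packet arrivals."""
    # Elapsed time since the previously accepted packet was detected.
    elapsed_usec = arrival_usec - last_arrival_usec
    # One token corresponds to one byte, so tokens = rate * elapsed time.
    return rate_limit_bytes_per_usec * elapsed_usec
```

For the example above, a rate limit of 25 bytes/μsec and an elapsed time of 2 μsec give 50 tokens.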

In an embodiment, a bucket has an associated parameter called a bucket capacity. The bucket capacity indicates the maximum number of tokens that the bucket may store. The bucket capacity regulates the maximum traffic burst that a rate limiter is able to tolerate. When the token bucket is filled to its capacity, no additional tokens are added to the token bucket.
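How the bucket capacity bounds a burst can be seen in a small simulation. This is a sketch under simplifying assumptions: the bucket starts full, all packets of the burst have the same size, and no time elapses between them, so no tokens are returned during the burst. The function name `count_accepted_in_burst` is hypothetical.

```python
def count_accepted_in_burst(capacity: int, packet_size: int,
                            burst_length: int) -> int:
    """Count how many back-to-back packets a full bucket can accept."""
    tokens = capacity          # assumption: bucket is full when the burst starts
    accepted = 0
    for _ in range(burst_length):
        if packet_size <= tokens:
            tokens -= packet_size   # accept and spend the tokens
            accepted += 1
        # otherwise the packet is dropped and tokens stay unchanged
    return accepted
```

With a capacity of 2048 tokens and 512-byte packets, a burst of six packets results in four accepted packets and two dropped packets under these assumptions.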

4. Example Flow Chart for Throttling CPU Utilization

FIG. 2 is an example flow chart for throttling CPU utilization by implementing a rate limiter. The steps described in FIG. 2 may be executed by a hypervisor, or by a rate limiter invoked by the hypervisor. In an embodiment, the hypervisor may invoke the rate limiter to generate recommendations on whether to drop or accept a data packet. A rate limiter is a computer program or a computer function that is configured to manage the token bucket associated with the CPU resources dedicated to processing data for which a CPU throttling request has been received. The rate limiter is used to generate the packet accept/drop recommendations.

For the purpose of describing the steps depicted in FIG. 2, it is assumed that a CPU throttling request has already been received from a user or an administrator, and that the request is for a data flow generated or received as certain applications execute on the host. The CPU throttling request is a request for allocating no more than a certain percentage of the CPU resources for processing the data generated or received at the host. The certain percentage of the CPU resources to be allocated for processing the data is referred to as a CPU utilization level value, and is typically specified in the request.

The CPU throttling request may also include a data flow identifier that identifies the data flow to which the request pertains. The data flow identifier may be a value indicating that all data packet processing is included, a VM identifier that indicates all data packet processing for packets issuing from and directed to a particular VM is included, or a destination MAC address indicating that data packets matching the destination MAC are included in the throttling.
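The three kinds of data flow identifier can be modeled as follows. This is only an illustrative sketch: the names `FlowScope`, `ThrottlingRequest`, and `packet_in_flow`, as well as the representation of VM identifiers and MAC addresses as strings, are assumptions and not part of the described request format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FlowScope(Enum):
    ALL_TRAFFIC = "all"      # all data packet processing on the host
    VM = "vm"                # packets issuing from or directed to one VM
    DEST_MAC = "dest_mac"    # packets whose destination MAC matches


@dataclass(frozen=True)
class ThrottlingRequest:
    cpu_utilization_level: float        # percent, e.g. 40.0
    scope: FlowScope
    scope_value: Optional[str] = None   # VM id or MAC; unused for ALL_TRAFFIC


def packet_in_flow(request: ThrottlingRequest, src_vm: str,
                   dst_vm: str, dst_mac: str) -> bool:
    """Decide whether a detected packet belongs to the throttled data flow."""
    if request.scope is FlowScope.ALL_TRAFFIC:
        return True
    if request.scope_value is None:
        return False  # malformed request: no VM id or MAC was supplied
    if request.scope is FlowScope.VM:
        return request.scope_value in (src_vm, dst_vm)
    # DEST_MAC: compare case-insensitively, since MAC notation varies.
    return request.scope_value.lower() == dst_mac.lower()
```

A hypervisor following this sketch would run the rate limiter only for packets for which `packet_in_flow` returns true, and would process all other packets as usual.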

In step 210, a data packet is detected at a certain point along a datapath at an arrival time. Detecting a data packet may be interpreted as observing the data packet at a certain point along the datapath that is used to communicate data packets of the data flow for which a CPU throttling request has been received. For example, a hypervisor may detect a data packet at a virtual port of a virtual switch along a portion of the datapath that is between a virtual machine and the virtual switch. This may also include detecting a data packet along the datapath that is internally processed by the virtual switch. This may further include detecting a data packet at an uplink configured to support communications between the hypervisor and a physical network.

An arrival time is the time at which the data packet is detected at a certain point along the datapath. The arrival time for the data packet is recorded or otherwise associated with the data packet.

In step 220, a hypervisor determines a packet size of the detected data packet. This may be accomplished by processing a header of the packet, identifying a payload length in the header, and determining the packet size as a sum of a length of the packet header and the payload length. For the purpose of explaining the rate limiter function, the packet size will be equated to a count of tokens that are required to accept the packet and transmit the packet toward its destination.

Also in this step, the hypervisor determines a rate limit for the detected data packet. A rate limit for a detected data packet indicates an amount of data that a CPU dedicated to processing the data packet can process within a certain time interval. An example of the time interval is one second. A rate limit may be determined based on a CPU utilization level included in the CPU throttling request received from a user or an administrator, a size of the packet, and whether the packet is encrypted.
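A table-driven lookup of the rate limit might look like the following Python sketch. The table layout, the names `UNENCRYPTED_RATE_LIMITS`, `ENCRYPTED_RATE_LIMITS`, and `lookup_rate_limit`, and every number in the tables are placeholders invented for this illustration; they are not the contents of the rate limit table of FIG. 4. The only property taken from the description is that the lookup depends on the CPU utilization level, the packet size, and whether the packet is encrypted.

```python
import bisect

# Placeholder tables, NOT values from FIG. 4. Assumed layout:
# {cpu_level_percent: [(max_packet_size_bytes, rate_limit_bytes_per_usec), ...]}
# with rows sorted by max_packet_size_bytes.
UNENCRYPTED_RATE_LIMITS = {
    25: [(256, 50.0), (1024, 100.0), (9000, 200.0)],
    50: [(256, 100.0), (1024, 200.0), (9000, 400.0)],
}
ENCRYPTED_RATE_LIMITS = {
    25: [(256, 10.0), (1024, 20.0), (9000, 40.0)],
    50: [(256, 20.0), (1024, 40.0), (9000, 80.0)],
}


def lookup_rate_limit(cpu_level_percent: int, packet_size: int,
                      encrypted: bool) -> float:
    """Return the rate limit for a packet from the matching table."""
    # Pick the table for encrypted or unencrypted packets.
    table = ENCRYPTED_RATE_LIMITS if encrypted else UNENCRYPTED_RATE_LIMITS
    if cpu_level_percent not in table:
        raise KeyError(f"no table row for CPU level {cpu_level_percent}%")
    rows = table[cpu_level_percent]
    # Find the first size bucket whose upper bound covers the packet.
    upper_bounds = [bound for bound, _ in rows]
    index = bisect.bisect_left(upper_bounds, packet_size)
    if index == len(rows):
        raise ValueError(f"packet size {packet_size} exceeds table range")
    return rows[index][1]
```

In this sketch the encrypted table holds lower placeholder rates than the unencrypted one, reflecting the higher CPU cost of encryption mentioned in the background; the real ratio would have to come from measurement.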

A CPU utilization level value indicates a CPU utilization level that is desired by a user or an administrator for processing the data flow generated by executing, for example, a virtual machine. The CPU utilization level value corresponds to the throttling level that the hypervisor will attempt to provide, but that will not be exceeded. Indeed, it is possible that the hypervisor will provide less than the CPU utilization level to the user's or the administrator's data flow. This may allow assigning the remaining amount of the CPU resource to non-traffic-related applications and/or system processes and tasks.

In step 230, the hypervisor determines whether an abstract bucket associated with the CPU dedicated to processing the administrator's data flow includes a sufficient count of tokens for processing the detected data packet. This may include comparing an adjusted current bucket size with the size of the detected data packet, and generating a recommendation to drop the packet if the packet size exceeds the adjusted current bucket size, or generating a recommendation to accept the packet otherwise.

An adjusted current bucket size may be determined as a minimum of 1) a capacity of the bucket, and 2) a sum of a current bucket size and a count of tokens returned to the bucket between a last arrival time of a previously processed packet and the arrival time of the detected data packet.

A last arrival time is the time when a previous data packet of the data flow was detected, whereas an arrival time is the time when the current data packet of the data flow was detected. The count of tokens returned to the bucket represents a count of tokens that might have been returned to the bucket because a certain count of bits of the previous data packet might have been already processed by the CPU by the arrival time.

For example, suppose that a previous packet was detected at time t₁, had a size of 1024 bytes, and was accepted for processing by the CPU. Furthermore, suppose that a current data packet was received at time t₂, and also has a size of 1024 bytes. Note, however, that the previous packet and the currently detected packet of the same data flow may have different sizes. Moreover, suppose that the rate limit for processing the data packets of the data flow to maintain the requested CPU utilization level is 512 bytes/μsec. Provided that t₁<t₂, t₁=5 μsec, and t₂=7 μsec, the count of tokens that were returned to the bucket at time t₂ is ((t₂−t₁)*rate limit)=(2*512)=1024. The 1024 corresponds to the count of tokens that were returned to the bucket by time t₂. Hence, by the time t₂ when the current data packet is detected, 1024 tokens have been returned to the bucket. Thus, the bucket stores at least 1024 tokens at time t₂.

Suppose that a current size of the bucket before detecting the current data packet at time t₂ was 1024. If, at time t₂, 1024 tokens have been returned to the bucket, then the current size of the bucket at time t₂ is (1024+1024)=2048. That means that the bucket has 2048 tokens at time t₂.

Suppose that a capacity of the bucket is 1 M tokens. Then, a minimum taken from 1) the current size of the bucket at time t₂, and 2) the capacity of the bucket is 2048 because 2048<1 M. The minimum of 1) the current size of the bucket at time t₂, and 2) the capacity of the bucket is referred to as an adjusted current bucket size.
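The worked example above can be checked with a short Python sketch. The function name `adjusted_bucket_size` is hypothetical, and the capacity of 1 M tokens is interpreted here as 1,000,000 tokens, which is an assumption about the notation.

```python
def adjusted_bucket_size(capacity: float, current_size: float,
                         last_arrival_usec: float, arrival_usec: float,
                         rate_limit_bytes_per_usec: float) -> float:
    """min(capacity, current size + tokens returned since last arrival)."""
    # Tokens returned between the last arrival time and the arrival time.
    returned = (arrival_usec - last_arrival_usec) * rate_limit_bytes_per_usec
    # The bucket never holds more tokens than its capacity.
    return min(capacity, current_size + returned)


# Values from the example: t1 = 5 usec, t2 = 7 usec, 512 bytes/usec,
# current size 1024, capacity assumed to be 1,000,000 tokens.
example = adjusted_bucket_size(1_000_000, 1024, 5, 7, 512)
```

Here `example` evaluates to 2048, matching the text. If the capacity were smaller than 2048, for instance 1500, the adjusted current bucket size would be capped at 1500.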

In step 240, the hypervisor determines whether the packet size of the data packet detected at time t₂ exceeds the adjusted current bucket size determined at time t₂. In the example described above, the adjusted current bucket size at time t₂ is 2048, and the packet size of the data packet detected at time t₂ is 1024 bytes. Thus, the packet size of the data packet detected at time t₂ does not exceed the adjusted current bucket size determined at time t₂.

If, in step 250, it is determined that the packet size of the data packet detected at time t₂ exceeds the adjusted current bucket size determined at time t₂, then step 280 is performed. Otherwise, step 260 is performed.

In step 260, the hypervisor reduces the current bucket size by the packet size of the detected data packet.

In step 270, the hypervisor generates and returns a recommendation to accept the detected data packet, and therefore to let the CPU dedicated to processing the data flow process the detected data packet.

Step 280 is performed if the hypervisor determined that the packet size of the data packet detected at time t₂ exceeds the adjusted current bucket size determined at time t₂. Therefore, the bucket does not have enough tokens at time t₂ to process the detected data packet while throttling the CPU utilization at, or below, the requested CPU utilization level. Thus, in step 280, the hypervisor generates and returns a recommendation to drop the detected data packet, and thus to omit the processing of the received data packet.

In step 290, the hypervisor updates the last arrival time to the arrival time of the detected data packet. The updating may include overwriting the last arrival time stored in data repository 150 with the arrival time of the detected data packet. Referring again to the above example, the updating may include updating the last arrival time to t₂, which is the time at which the current data packet was detected.

The above approach allows generating recommendations to either accept or drop a detected data packet of the data flow for which a CPU throttling request was received. If the data packet includes n bytes, and if there are fewer than n tokens in the bucket, then the rate limiter recommends that the detected data packet be dropped because processing the detected data packet would require exceeding the CPU utilization level that a user or an administrator requested for the data flow.

However, if there are at least n tokens in the bucket, then the rate limiter recommends that the detected data packet be accepted because processing the detected data packet would not require exceeding the CPU utilization level that a user or an administrator requested for the data flow.

5. Example Pseudocode for Determining Whether to Accept or Drop a Detected Data Packet

FIG. 5 depicts pseudocode representing an algorithm for determining whether to accept or drop a detected packet based on a current bucket size, an arrival time, a last arrival time, a packet size, and a rate limit. In FIG. 5, it is assumed that a CPU throttling request has already been received from a user or an administrator, and that the request is for a data flow generated as certain applications are executed for the user or the administrator. The CPU throttling request is a request for allocating no more than a CPU utilization level of the CPU resource to the data flow. The algorithm presented in FIG. 5 is designed to determine whether the data packet, which is detected at the arrival time, which has the packet size, and which belongs to the data flow for which the CPU throttling request has been received, is to be accepted or dropped so that the CPU utilization level specified in the CPU throttling request is not exceeded.

The pseudocode described in FIG. 5 is called a Rate limiter function 510, and includes a call to a Rate_Limiter 520 and a set of instructions 530.

Rate_Limiter 520 has several parameters, including a current_bucket_size, an arrival_time, a last_arrival_time, a rate_limit, and a bucket_capacity. The parameters correspond to a current bucket size, an arrival time, a last arrival time, a rate limit, and a bucket capacity, respectively. Rate_Limiter 520 returns the current_bucket_size that is computed as min(current_bucket_size + (arrival_time − last_arrival_time) * rate_limit, bucket_capacity), that is, the minimum of 1) the current_bucket_size increased by the tokens returned since the last_arrival_time, and 2) the bucket_capacity.

If the current_bucket_size is greater than, or equal to, the packet_size, then instructions 530 are performed. Otherwise, instructions 540 are performed.

Instructions 530 are executed if the current_bucket_size is greater than, or equal to, the packet_size. Instructions 530 include subtracting the packet_size from the current_bucket_size, and returning a recommendation to accept the detected data packet.

Instructions 540 are executed if the current_bucket_size is less than the packet_size. Instructions 540 include returning a recommendation to drop the detected data packet.

After executing either instructions 530 or instructions 540, the last_arrival_time is updated with the arrival_time of the detected packet.

Once the last_arrival_time is updated, the algorithm terminates.
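
Because FIG. 5 itself is not reproduced here, the following is a minimal Python sketch of the algorithm as this section describes it: refill, cap at the capacity, compare with the packet size, and update the last arrival time. The FlowBucket container and the string return values are assumptions made for illustration; only the variable names current_bucket_size, arrival_time, last_arrival_time, packet_size, rate_limit, and bucket_capacity come from the description.

```python
from dataclasses import dataclass

ACCEPT = "accept"
DROP = "drop"


@dataclass
class FlowBucket:
    """Per-flow token bucket state (hypothetical container)."""
    current_bucket_size: float
    last_arrival_time: float
    bucket_capacity: float


def rate_limiter(bucket, arrival_time, packet_size, rate_limit):
    """Return a recommendation to accept or drop the detected packet."""
    # Refill for the elapsed time and cap the result at the bucket capacity.
    elapsed = arrival_time - bucket.last_arrival_time
    bucket.current_bucket_size = min(
        bucket.current_bucket_size + elapsed * rate_limit,
        bucket.bucket_capacity,
    )
    if bucket.current_bucket_size >= packet_size:
        # Enough tokens: spend them and accept the packet.
        bucket.current_bucket_size -= packet_size
        recommendation = ACCEPT
    else:
        # Not enough tokens: the bucket is left unchanged.
        recommendation = DROP
    # In both cases the last arrival time is updated.
    bucket.last_arrival_time = arrival_time
    return recommendation


bucket = FlowBucket(current_bucket_size=1024, last_arrival_time=5,
                    bucket_capacity=1_000_000)
print(rate_limiter(bucket, arrival_time=7, packet_size=1024, rate_limit=512))
# accept
```

After the call above, the bucket holds 1024 tokens (2048 after the refill, minus the 1024-byte packet). A 4096-byte packet arriving at the same instant would be dropped, since no additional tokens have been returned.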

6. Selecting a Default Mode or a CPU Throttling Mode

A default mode may be realized by executing datapath processing in a best-effort fashion. The default mode is also referred to as a best-effort mode. That means that when CPU resources are available, a request for packet processing is served rather than leaving the CPU unused.

In contrast, a CPU throttling mode is used to control an upper bound of the CPU resources that can be used to process data packets, so that the remaining portion of the CPU resources may remain available for other application and system processes. The CPU throttling mode may be triggered upon receiving CPU throttling requests from the administrators. In an embodiment, the administrators control whether to turn the CPU throttling mode on, or to turn it off.

An administrator may trigger a CPU throttling mode by issuing a CPU throttling request. The decision whether to issue a CPU throttling request may depend on what and how the administrator wishes to control. For example, if the traffic intensity generated by an application is high and the CPU can potentially be exhausted by performing the packet processing, the administrator may want to throttle the CPU utilization. To do so, the administrator may issue a CPU throttling request. Issuing the CPU throttling request will impose a maximum allocation of CPU resources for data traffic processing, and leave the remaining CPU resources to other time-critical applications.

FIG. 3A is a block diagram for determining whether to allocate CPU resources to tasks based on a CPU throttling mode or a default mode. In step 302, a hypervisor determines whether a CPU throttling request has been received for a data flow. The request may specify the data flow to which the CPU utilization throttling applies, and a CPU utilization level that is desired, and that is not to be exceeded, for processing data packets of the data flow.

If, in step 303, the hypervisor determines that the CPU throttling request has been received for the data flow, then step 304 is performed. Otherwise, step 305 is performed.

In step 304, a CPU throttling mode is applied to processing data packets that belong to the data flow, and subsequently, the steps described in FIG. 2 are performed.

However, if the hypervisor determines that the CPU throttling request has not been received for the data flow, then, in step 305, the data flow is processed in the best-effort mode, in which the CPU allocates resources to the packet processing in the best-effort fashion.
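
The mode selection of FIG. 3A can be sketched as a lookup of pending throttling requests by data flow identifier. The dictionary of requests, the flow identifiers, and the select_mode name below are hypothetical and only illustrate the branching between steps 304 and 305.

```python
def select_mode(flow_id, throttling_requests):
    """Return ("throttling", level) if a request exists for the flow,
    otherwise ("best_effort", None)."""
    level = throttling_requests.get(flow_id)
    if level is not None:
        # Step 304: throttle the flow at the requested CPU utilization level.
        return ("throttling", level)
    # Step 305: no request was received, so use the default mode.
    return ("best_effort", None)


# Hypothetical state: one request asking for at most 30% CPU for "flow-a".
requests = {"flow-a": 30}
print(select_mode("flow-a", requests))  # ('throttling', 30)
print(select_mode("flow-b", requests))  # ('best_effort', None)
```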

7. Determining a Rate Limit in a CPU Throttling Mode

FIG. 3B is an example flow chart for determining a rate limit in a CPU throttling mode. The depicted flow chart provides the details for computing the rate limit. The rate limit is used in step 220 of FIG. 2.

For the purpose of explaining the steps in FIG. 3B, it is assumed that a CPU throttling request for a data flow has been received, and a data packet of the data flow has been detected.

In step 310, a hypervisor extracts a CPU utilization level from a CPU throttling request received from a user or an administrator for the data flow.

A CPU utilization level extracted from the CPU throttling request is expressed as a percentage value, and is usually less than 100%. The CPU utilization level is the maximum CPU utilization that the user or the administrator requests to have allocated to processing of the user's or the administrator's data flow. The hypervisor uses the CPU utilization level as guidance to provide no more than that level of the CPU resources to the data flow generated as, for example, a virtual machine is executed.

In step 320, the hypervisor determines a packet size of the detected data packet. This may include processing a header of the packet, identifying a payload length in the header, and determining the packet size as a sum of a length of the packet header and the payload length. Also in this step, it is determined whether the detected data packet is encrypted or not.

Determining whether the packet is encrypted or not allows selecting a particular table that may be used to determine the rate limit for the packet. Typically, the rate limits for encrypted packets are different from the rate limits for unencrypted packets. If the packet is encrypted, then a rate limit table for encrypted data packets is used. If the packet is unencrypted, then a rate limit table for unencrypted data packets is used. The reason for having two separate types of rate limit tables is that processing encrypted data packets differs from processing unencrypted data packets, and therefore, the rates at which the encrypted data packets are processed are different from the rates at which the unencrypted data packets are processed.

If it is determined, in step 330, that the detected packet is encrypted, then step 350 is performed. Otherwise, step 340 is performed. Determination of whether a packet is encrypted or not can be performed by checking the header. For example, a protocol number of 50 or 99 in the IP header may indicate an encrypted packet.
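
As an illustration of the header check, the sketch below reads the protocol field of an IPv4 header and treats protocol numbers 50 and 99 as encrypted, as in the example above. It assumes IPv4 only and uses the IPv4 total length field, which covers the header plus the payload, as the packet size; the function name inspect_ipv4_packet is hypothetical.

```python
import struct

# Protocol numbers named in the text as indicating an encrypted packet.
ENCRYPTED_PROTOCOLS = {50, 99}


def inspect_ipv4_packet(packet: bytes):
    """Return (packet_size, is_encrypted) for a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("too short for an IPv4 header")
    # Byte 0: version and header length; bytes 2-3: total length in bytes.
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    # Byte 9 of the IPv4 header carries the protocol number.
    protocol = packet[9]
    return total_length, protocol in ENCRYPTED_PROTOCOLS


# A bare 20-byte header claiming a 1024-byte packet with protocol 50.
header = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 1024, 0, 0, 64, 50, 0, bytes(4), bytes(4))
print(inspect_ipv4_packet(header))  # (1024, True)
```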

Step 350 is performed when it is determined that the detected packet is encrypted. In this step, the hypervisor determines whether, for the packet size of the detected packet and for the CPU utilization level extracted from the CPU throttling request, an exact rate limit is present in the rate limit table for the encrypted data packets.

If, in step 352, it is determined that the exact rate limit is present in the rate limit table for the encrypted data packets, then, in step 354, the exact rate limit is extracted from the table.

FIG. 4 depicts an example rate limit table. The depicted rate limit table 420 is for encrypted data packets. A similar rate limit table may be generated for unencrypted data packets as well.

Example rate limit table 420 is a data structure, such as a data table, and is used to store rate limit values for different sizes of data packets and different CPU utilization levels. Example rate limit table 420 may include, for example, columns and rows: the columns may be labeled using CPU utilization levels, and the rows may be labeled using packet sizes. Examples of packet sizes may include 512 bytes, 1024 bytes, 2048 bytes, and so forth. Examples of CPU utilization levels may include 10%, 20%, 30%, 40%, and so forth. In the depicted example, a row 422 includes the rate limits, one per CPU utilization level, for encrypted data packets that contain 512 bytes, while a row 424 includes the rate limits for encrypted data packets that contain 1024 bytes. Information included in example rate limit table 420 for encrypted data packets is provided to illustrate a clear example of the table, and not to correspond to actual traffic data.

Suppose that it is determined that a packet size is 512 bytes and the requested CPU utilization level is 30%. Example rate limit table 420 for encrypted data packets includes row 422 for that packet size, and row 422 includes an entry 426 for that CPU utilization level. Therefore, for the packet size of 512 bytes and the requested CPU utilization level of 30%, the hypervisor may extract from entry 426 of example rate limit table 420 the value of 25.

Referring again to FIG. 3B, if, in step 352, it is determined that an exact rate limit is not present in the rate limit table for the encrypted data packets, then, in step 356, a rate limit is computed as an approximation from two rate limits extracted from the rate limit table for the encrypted data packets.

Suppose that it is determined that a packet size is 1024 bytes and the requested CPU utilization level is 25%. Example rate limit table 420 for encrypted data packets includes row 424 for that packet size. However, row 424 does not include an entry for that CPU utilization level. But row 424 includes an entry 428 for the 20% CPU utilization level and an entry 429 for the 30% CPU utilization level. Therefore, for the packet size of 1024 bytes and the requested CPU utilization level of 25%, the hypervisor may extract entries 428 and 429, compute an average value based on the entries 428-429, and use the computed average value as an approximated rate limit. In this example, the approximated rate limit is 30 bytes/μsec because ((25+35)/2)=30.
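
The exact lookup and the averaging fallback can be sketched as below. The table holds only the three values that the examples above state or imply (25 for 512 bytes at 30%, and 25 and 35 for 1024 bytes at 20% and 30%, the latter two inferred from the averaging example); the nested-dictionary layout and the lookup_rate_limit name are assumptions, not the structure of FIG. 4.

```python
import bisect

# Rate limits in bytes per microsecond, keyed by packet size, then by
# CPU utilization level in percent. Only values given in the text appear.
ENCRYPTED_RATE_LIMITS = {
    512: {30: 25},
    1024: {20: 25, 30: 35},
}


def lookup_rate_limit(table, packet_size, cpu_level):
    """Return the exact rate limit, or the average of the two entries
    whose CPU utilization levels surround the requested level."""
    row = table[packet_size]
    if cpu_level in row:
        return row[cpu_level]
    levels = sorted(row)
    i = bisect.bisect_left(levels, cpu_level)
    if i == 0 or i == len(levels):
        # The requested level is outside the range covered by the row.
        raise KeyError("no neighboring entries to average")
    lower, upper = levels[i - 1], levels[i]
    return (row[lower] + row[upper]) / 2


print(lookup_rate_limit(ENCRYPTED_RATE_LIMITS, 512, 30))   # 25
print(lookup_rate_limit(ENCRYPTED_RATE_LIMITS, 1024, 25))  # 30.0
```

A plain average matches the text only when the requested level lies midway between the two neighbors; the description does not say how other levels, or packet sizes missing from the table, are handled, so this sketch raises an error in those cases.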

Step 340 is performed when it is determined that the detected packet is unencrypted. In this step, the hypervisor determines whether, for the packet size of the detected packet and for the CPU utilization level extracted from the CPU throttling request, an exact rate limit is present in a rate limit table for unencrypted data packets.

A rate limit table for unencrypted data packets may be a data structure, such as a data table, and is used to store rate limit values for different sizes of unencrypted data packets and different CPU utilization levels. The table may include, for example, columns and rows: the columns may be labeled using CPU utilization levels, and the rows may be labeled using packet sizes. The rate limit table for unencrypted data packets may be organized like the rate limit table for encrypted data packets, except that it will include rates that are specific to unencrypted data packets.

If, in step 342, it is determined that an exact rate limit is present in the rate limit table for unencrypted data packets, then, in step 344, the exact rate limit is extracted from the table. For example, if the packet size is 512 bytes, the requested CPU utilization level is 20%, and the rate limit table for unencrypted data packets includes an entry for that packet size and that CPU utilization level, then the exact rate limit value may be extracted from the rate limit table for unencrypted data packets.

However, if, in step 342, it is determined that an exact rate limit is absent from the rate limit table for unencrypted data packets, then, in step 346, a rate limit is computed as an average of two rate limits extracted from the rate limit table for unencrypted data packets. For example, if the packet size is 512 bytes and the requested CPU utilization level is 25%, and the rate limit table for the unencrypted data packets includes one entry for that packet size and the CPU utilization level of 20%, and another entry for that packet size and the CPU utilization level of 30%, then an approximated rate limit value may be computed as an average value of the two rates.

In step 360, the determined rate limit is returned. The determined rate limit may be either the exact rate limit determined in step 344 or step 354, or the approximated rate limit computed in step 346 or step 356. The determined rate limit may be used in step 230 of FIG. 2 to determine an adjusted current bucket size.

8. Improvements Provided by Certain Embodiments

In an embodiment, the techniques described herein allow throttling the CPU utilization for data flow processing so that both the data flow and other tasks/applications receive some CPU resources. The throttling may be initiated when a control plane or a hypervisor receives a CPU throttling request from a user or an administrator. The CPU throttling request specifies a CPU utilization level that is not to be exceeded by the data flow processing.

A CPU utilization level specified in a CPU throttling request corresponds to a desired CPU utilization level for processing the data traffic. The desired CPU utilization level included in the request is usually less than 100% so that the remaining portion of the CPU resources (not allocated for processing the data traffic) may be allocated to other tasks and applications.

To guarantee a desired CPU utilization level for the data traffic, a hypervisor determines, on a per-packet basis, whether a detected packet is to be dropped or transmitted toward its destination. For example, if transmitting a detected data packet toward its destination would require exceeding the desired CPU utilization level requested by a user or an administrator for a data flow, then the hypervisor may recommend dropping the packet. Otherwise, the data packet may be transmitted toward its destination.

The approach may be implemented along any part of a datapath of the data packet communications. For example, the approach may be implemented at a point of the datapath that is between a virtual machine and a virtual switch. The approach may also be implemented along the datapath that is internally processed by the virtual switch. Furthermore, the approach may be implemented at a point of the datapath that is between hardware components and a hypervisor. Therefore, the CPU utilization may be throttled along different portions of the datapath for the data flow.

9. Implementation Mechanisms

The present approach may be implemented using a computing system comprising one or more processors and memory. The one or more processors and memory may be provided by one or more hardware machines. A hardware machine includes a communications bus or other communication mechanisms for addressing main memory and for transferring data between and among the various components of the hardware machine. The hardware machine also includes one or more processors coupled with the bus for processing information. The processor may be a microprocessor, a system on a chip (SoC), or other type of hardware processor.

Main memory may be a random-access memory (RAM) or other dynamic storage device. It may be coupled to a communications bus, and used for storing information and software instructions to be executed by a processor. Main memory may also be used for storing temporary variables or other intermediate information during execution of software instructions to be executed by one or more processors.

10. General Considerations

Although some of the various drawings may illustrate a number of logical stages in a particular order, stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings may be specifically mentioned, others will be obvious to those of ordinary skill in the art, so the ordering and groupings presented herein are not an exhaustive list of alternatives. Moreover, it should be recognized that the stages could be implemented in hardware, firmware, software or any combination thereof.

The foregoing description, for purpose of explanation, has been described regarding specific embodiments. However, the illustrative embodiments above are not intended to be exhaustive or to limit the scope of the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen to best explain the principles underlying the claims and their practical applications, to thereby enable others skilled in the art to best use the embodiments with various modifications as are suited to the uses contemplated.

In the foregoing specification, embodiments of the approach have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the approach, and what is intended by the applicants to be the scope of the approach, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

Any definitions set forth herein for terms contained in the claims may govern the meaning of such terms as used in the claims. No limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of the claim in any way. The specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

As used herein, the terms “include” and “comprise” (and variations of those terms, such as “including,” “includes,” “comprising,” “comprises,” “comprised” and the like) are intended to be inclusive and are not to exclude further features, components, integers or steps.

References in this document to “an embodiment” indicate that the embodiment described or illustrated may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described or illustrated in connection with an embodiment, it is believed to be within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly indicated.

Various features of the disclosure have been described using process steps. The functionality/processing of a given process step could potentially be performed in different ways and by different systems or system modules. Furthermore, a given process step could be divided into multiple steps and/or multiple steps could be combined into a single step. Furthermore, the order of the steps can be changed without departing from the scope of the present disclosure.

It will be understood that the embodiments disclosed and defined in this specification extend to alternative combinations of the individual features and components mentioned or evident from the text or drawings. These different combinations constitute various alternative aspects of the embodiments.

What is claimed is:
 1. A method for throttling CPU utilization by data flows in a computer virtual network, comprising: receiving, by a hypervisor, a CPU utilization throttling request from a control plane or a user for a data flow, the request comprising a CPU utilization level and an identifier of the data flow; detecting, at a certain point along a datapath, a data packet that belongs to the data flow identified by the identifier of the data flow; determining a packet size of the data packet; accessing a rate limit table, which stores associations between packet sizes, rate limit values, and CPU utilization levels that define a percentage of a CPU dedicated for processing data packets; determining, for the data packet, a rate limit value that indicates a speed at which the data packet is processed based at least on the CPU utilization level and the packet size, wherein determining the rate limit value comprises: determining whether the rate limit table includes a rate limit for the packet size of the packet: when the rate limit table does not include a rate limit value for the packet size of the packet: computing an approximate rate limit from one or more utilization level entries; and returning the approximate rate limit as the rate limit value; when the rate limit table includes the rate limit for the packet size of the packet, retuning the rate limit as the rate limit value; determining, based at least in part on an outcome of a rate limiter function applied to the rate limit value, whether the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination; in response to determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination, generating a recommendation to drop the data packet, causing the data packet to be dropped; and in response to determining that the CPU utilization level for the data flow would not be exceeded if the data packet is 
transmitted toward its destination, generating a recommendation to allow a transmission of the data packet.
 2. The method of claim 1, wherein determining, based on the rate limit value, whether the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination comprises: determining an arrival time at which the data packet was detected at the certain point of the datapath; determining a last arrival time at which a previous data packet was detected; determining a current bucket size and a bucket capacity of a bucket associated with a CPU resource dedicated to a processing of the data flow; wherein the current bucket size indicates a size of the bucket at the arrival time; wherein the bucket capacity indicates a maximum size of the bucket; computing a count of tokens that have been returned to the bucket by the arrival time; wherein computing the count of tokens comprises determining a product of the rate limit value and a difference between the last arrival time at which a previous data packet was detected and the arrival time at which the data packet was detected; computing an updated current bucket size by adding the count of tokens to the current bucket size; determining an adjusted current bucket size by selecting a minimum of the bucket capacity and the updated current bucket size; determining whether the packet size exceeds the adjusted current bucket size; and in response to determining that the packet size exceeds the adjusted current bucket size, determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination.
 3. The method of claim 2, further comprising: in response to determining that the packet size does not exceed the adjusted current bucket size, determining that the CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination.
 4. The method of claim 1, further comprising: in response to determining that an exact rate limit is absent from the rate limit table for the data packet, identifying, in the rate limit table, the two utilization level entries that are indexed using the packet size and that are similar the CPU utilization level in the request.
 5. The method of claim 1, wherein accessing the rate limit table to determine, based on the CPU utilization level and the packet size, the rate limit value for the data packet comprises: determining whether the data packet is encrypted; and in response to determining that the data packet is encrypted, accessing the rate limit table for encrypted data packets, to determine the rate limit value from the rate limit table for the encrypted data packets, and based on the CPU utilization level and the packet size.
 6. The method of claim 1, wherein the identifier of the data flow is a value indicating that all data packet processing is included.
 7. One or more non-transitory computer-readable storage media storing one or more computer instructions for throttling CPU utilization by data flows in a computer virtual network which, when executed by one or more processors, cause the one or more processors to perform: receiving, by a hypervisor, a CPU utilization throttling request from a control plane or a user for a data flow, the request comprising a CPU utilization level and an identifier of the data flow; detecting, at a certain point along a datapath, a data packet that belongs to the data flow identified by the identifier of a data flow; determining a packet size of the data packet; accessing a rate limit table, which stores associations between packet sizes, rate limit values, and CPU utilization levels that define a percentage of a CPU dedicated for processing data packets; determining, for the data packet, a rate limit value that indicates a speed at which the data packet is processed based at least on the CPU utilization level and the packet size, wherein determining the rate limit value comprises: determining whether the rate limit table includes a rate limit for the packet size of the packet: when the rate limit table does not include a rate limit value for the packet size of the packet: computing an approximate rate limit from one or more utilization level entries; and returning the approximate rate limit as the rate limit value; when the rate limit table includes the rate limit for the packet size of the packet, retuning the rate limit as the rate limit value; determining, based at least in part on an outcome of a rate limiter function applied to the rate limit value, whether the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination; in response to determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination, generating a recommendation to drop the data packet; 
and in response to determining that the CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination, generating a recommendation to allow a transmission of the data packet.
 8. The one or more non-transitory computer-readable storage media of claim 7, storing additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: determining an arrival time at which the data packet was detected at the certain point of the datapath; determining a last arrival time at which a previous data packet was detected; determining a current bucket size and a bucket capacity of a bucket associated with a CPU resource dedicated to a processing of the data flow; wherein the current bucket size indicates a size of the bucket at the arrival time; wherein the bucket capacity indicates a maximum size of the bucket; computing a count of tokens that have been returned to the bucket by the arrival time; wherein computing the count of tokens comprises determining a product of the rate limit value and a difference between the last arrival at which a previous data packet was detected and the arrival time at which the data packet was detected; computing an updated current bucket size by adding the count of tokens to the current bucket size; determining an adjusted current bucket size by selecting a minimum of the bucket capacity and the updated current bucket size; determining whether the packet size exceeds the adjusted current bucket size; and in response to determining that the packet size exceeds the adjusted current bucket size, determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination.
 9. The one or more non-transitory computer-readable storage media of claim 8, storing additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: in response to determining that the packet size does not exceed the adjusted current bucket size, determining that the CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination.
 10. The one or more non-transitory computer-readable storage media of claim 7, storing additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: in response to determining that an exact rate limit is absent from the rate limit table for the data packet, identifying, in the rate limit table, the two utilization level entries that are indexed using the packet size and that are similar the CPU utilization level in the request.
 11. The one or more non-transitory computer-readable storage media of claim 7, storing additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: determining whether the data packet is encrypted; in response to determining that the data packet is encrypted, accessing the rate limit table for encrypted data packets, to determine the rate limit value from the rate limit table for the encrypted data packets, and based on the CPU utilization level and the packet size; and in response to determining that the data packet is unencrypted, accessing the rate limit table for unencrypted data packets, to determine the rate limit value from the rate limit table for the unencrypted data packets, and based on the CPU utilization level and the packet size.
 12. The one or more non-transitory computer-readable storage media of claim 7, wherein the identifier of the data flow is a value indicating that all data packet processing is included.
 13. A hypervisor implemented in a host computer and configured to implement a rule processing and enforcement for interleaved Layer 4, Layer 7 and verb-based rulesets, the hypervisor comprising: one or more processors; one or more memory units; and one or more non-transitory computer-readable storage media storing one or more computer instructions for throttling CPU utilization by data flows in a computer virtual network which, when executed by the one or more processors, cause the one or more processors to perform: receiving, by the hypervisor, a CPU utilization throttling request from a control plane or a user for a data flow, the request comprising a CPU utilization level and an identifier of the data flow; detecting, at a certain point along a datapath, a data packet that belongs to the data flow identified by the identifier of a data flow; determining a packet size of the data packet; accessing a rate limit table, which stores associations between packet sizes, rate limit values, and CPU utilization levels that define a percentage of a CPU dedicated for processing data packets; determining, for the data packet, a rate limit value that indicates a speed at which the data packet is processed based at least on the CPU utilization level and the packet size, wherein determining the rate limit value comprises: determining whether the rate limit table includes a rate limit for the packet size of the packet: when the rate limit table does not include a rate limit value for the packet size of the packet: computing an approximate rate limit from one or more utilization level entries; and returning the approximate rate limit as the rate limit value; when the rate limit table includes the rate limit for the packet size of the packet, returning the rate limit as the rate limit value; determining, based at least in part on an outcome of a rate limiter function applied to the rate limit value, whether the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination; in response to determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination, generating a recommendation to drop the data packet, causing the data packet to be dropped; and in response to determining that the CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination, generating a recommendation to allow a transmission of the data packet.
 14. The hypervisor of claim 13, wherein the one or more non-transitory computer-readable storage media store additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: determining an arrival time at which the data packet was detected at the certain point of the datapath; determining a last arrival time at which a previous data packet was detected; determining a current bucket size and a bucket capacity of a bucket associated with a CPU resource dedicated to a processing of the data flow; wherein the current bucket size indicates a size of the bucket at the arrival time; wherein the bucket capacity indicates a maximum size of the bucket; computing a count of tokens that have been returned to the bucket by the arrival time; wherein computing the count of tokens comprises determining a product of the rate limit value and a difference between the last arrival time at which a previous data packet was detected and the arrival time at which the data packet was detected; computing an updated current bucket size by adding the count of tokens to the current bucket size; determining an adjusted current bucket size by selecting a minimum of the bucket capacity and the updated current bucket size; determining whether the packet size exceeds the adjusted current bucket size; and in response to determining that the packet size exceeds the adjusted current bucket size, determining that the CPU utilization level for the data flow would be exceeded if the data packet is transmitted toward its destination.
 15. The hypervisor of claim 14, wherein the one or more non-transitory computer-readable storage media store additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: in response to determining that the packet size does not exceed the adjusted current bucket size, determining that the CPU utilization level for the data flow would not be exceeded if the data packet is transmitted toward its destination.
 16. The hypervisor of claim 13, wherein the one or more non-transitory computer-readable storage media store additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: in response to determining that an exact rate limit is absent from the rate limit table for the data packet, identifying, in the rate limit table, the two utilization level entries that are indexed using the packet size and that are similar to the CPU utilization level in the request.
 17. The hypervisor of claim 13, wherein the one or more non-transitory computer-readable storage media store additional instructions which, when executed by the one or more processors, cause the one or more processors to perform: determining whether the data packet is encrypted; in response to determining that the data packet is encrypted, accessing the rate limit table for encrypted data packets, to determine the rate limit value from the rate limit table for the encrypted data packets, and based on the CPU utilization level and the packet size; and in response to determining that the data packet is unencrypted, accessing the rate limit table for unencrypted data packets, to determine the rate limit value from the rate limit table for the unencrypted data packets, and based on the CPU utilization level and the packet size.
 18. The hypervisor of claim 13, wherein the identifier of the data flow is a value indicating that all data packet processing is included. 
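
For illustration only, the rate limit lookup and the bucket check recited in the claims above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the table layout, the table values, and the names `RATE_LIMIT_TABLE`, `lookup_rate_limit`, `Bucket`, and `should_drop` are hypothetical. The sketch assumes that the table has a row for the packet size, that a missing CPU utilization level is approximated by linear interpolation between the two nearest utilization level entries, and that tokens equal to the packet size are removed from the bucket when a packet is allowed; the claims specify none of these details.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical table: packet size (bytes) -> {CPU utilization level (%): rate limit (bytes/s)}.
# All values are invented for illustration.
RATE_LIMIT_TABLE = {
    64: {10: 1_000.0, 50: 5_000.0},
    1500: {10: 20_000.0, 50: 100_000.0},
}


def lookup_rate_limit(table, packet_size, cpu_level):
    """Return the exact rate limit if present, else an approximate one."""
    levels = table[packet_size]  # assumption: a row exists for this packet size
    if cpu_level in levels:
        return levels[cpu_level]
    keys = sorted(levels)
    i = bisect_left(keys, cpu_level)
    if i == 0:               # below the lowest stored level: clamp
        return levels[keys[0]]
    if i == len(keys):       # above the highest stored level: clamp
        return levels[keys[-1]]
    lo, hi = keys[i - 1], keys[i]  # the two nearest utilization level entries
    frac = (cpu_level - lo) / (hi - lo)
    # Assumption: linear interpolation between the two entries.
    return levels[lo] + frac * (levels[hi] - levels[lo])


@dataclass
class Bucket:
    capacity: float      # maximum size of the bucket
    size: float          # current bucket size
    last_arrival: float  # arrival time of the previous packet (seconds)


def should_drop(bucket, packet_size, rate_limit, arrival_time):
    """Return True if transmitting the packet would exceed the CPU utilization level."""
    # Tokens returned to the bucket since the previous packet arrived.
    tokens = rate_limit * (arrival_time - bucket.last_arrival)
    # Updated size, capped at the bucket capacity.
    adjusted = min(bucket.capacity, bucket.size + tokens)
    bucket.last_arrival = arrival_time
    if packet_size > adjusted:
        bucket.size = adjusted
        return True  # recommend dropping the packet
    # Assumption: an allowed packet consumes tokens equal to its size.
    bucket.size = adjusted - packet_size
    return False     # recommend allowing the transmission
```

In this sketch, a caller would first obtain the rate limit for the packet size and the requested CPU utilization level, then pass it to `should_drop` together with the arrival time of the packet. Separate tables for encrypted and unencrypted packets could be selected before the lookup.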